A novel oversampling and feature selection hybrid algorithm for imbalanced data classification


Abstract

Traditional approaches tend to cause classifier bias on imbalanced data sets, resulting in poor classification performance for minority classes. In particular, imbalanced data are common in financial fraud, network intrusion, and fault detection, where the recognition rate of minority classes is more pertinent than that of majority classes. Therefore, there is pressure to develop efficient algorithms that solve the class imbalance problem. To this end, this article presents a novel hybrid algorithm, Negative Binary General (NBG), which improves classification performance by combining an oversampling algorithm with a feature selection algorithm. A novel oversampling algorithm, Negative-positive Synthetic Minority Oversampling Technique (NPSMOTE), improves the practicability of sample generation, while the Binary Ant Lion Optimizer (BALO) extracts the most significant features to improve classification performance. Simulation experiments carried out on seven benchmark data sets demonstrate that the proposed NBG algorithm performs significantly better on small-sample imbalanced data classification than nine existing algorithms and six recently published ones.
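The abstract does not give the details of NPSMOTE or BALO, so the following is only a minimal sketch of the general idea of pairing oversampling with feature selection. It assumes classic SMOTE-style interpolation between a minority sample and one of its nearest minority neighbours, followed by a fixed binary feature mask of the kind a binary optimizer would search over. The function names `smote_sketch` and `apply_feature_mask`, the toy data, and the mask values are all hypothetical and are not taken from the paper.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Classic SMOTE-style interpolation (NOT the paper's NPSMOTE).

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the other minority samples, nearest first.
        others = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def apply_feature_mask(rows, mask):
    """Keep only the columns whose mask bit is 1.

    The mask stands in for the output of a binary feature-selection
    optimizer; here it is simply hard-coded for illustration.
    """
    return [[v for v, keep in zip(row, mask) if keep] for row in rows]


# Hypothetical toy minority class with three features.
minority = [[1.0, 2.0, 0.5], [1.2, 1.9, 0.4], [0.9, 2.2, 0.7], [1.1, 2.1, 0.6]]
new_points = smote_sketch(minority, n_new=6)
balanced_minority = minority + new_points
reduced = apply_feature_mask(balanced_minority, mask=[1, 0, 1])
print(len(balanced_minority), len(reduced[0]))
```

In a full pipeline the mask would be chosen by evaluating a classifier on each candidate subset rather than fixed in advance; that search is omitted here.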


Similar resources

A Novel One Sided Feature Selection Method for Imbalanced Text Classification

Imbalanced data can be seen in various areas such as text classification, credit card fraud detection, risk management, web page classification, image classification, medical diagnosis/monitoring, and biological data analysis. Classification algorithms tend to favor the large class and might even treat the minority class data as outliers. The text data is one of t...


H-BwoaSvm: A Hybrid Model for Classification and Feature Selection of Mammography Screening Behavior Data

Breast cancer is one of the most common cancers in the world. Early detection of cancer significantly reduces morbidity rates and treatment costs. Mammography is a known effective method for diagnosing breast cancer. One way to identify mammography screening behavior is to evaluate women's awareness of participating in mammography screening programs. Today, intelligent systems could...


Adaptive Oversampling for Imbalanced Data Classification

Data imbalance is known to significantly hinder the generalization performance of supervised learning algorithms. A common strategy to overcome this challenge is synthetic oversampling, where synthetic minority class examples are generated to balance the distribution between the examples of the majority and minority classes. We present a novel adaptive oversampling algorithm, VIRTUAL, that comb...


A Novel Approach to Feature Selection Using PageRank algorithm for Web Page Classification

In this paper, a novel filter-based approach is proposed that uses the PageRank algorithm to select the optimal subset of features and to compute their weights for web page classification. To evaluate the proposed approach, multiple experiments are performed using accuracy score as the main criterion on four different datasets, namely WebKB, Reuters-R8, Reuters-R52, and 20NewsGroups. By analy...


Oversampling Method for Imbalanced Classification

The classification problem for imbalanced datasets is pervasive in many data mining domains. Imbalanced classification has been a hot topic in the academic community. From the data level to the algorithm level, many solutions have been proposed to tackle the problems resulting from imbalanced datasets. SMOTE is the most popular data-level method, and many derivations based on it have been developed to ...



Journal

Journal title: Multimedia Tools and Applications

Year: 2022

ISSN: 1380-7501, 1573-7721

DOI: https://doi.org/10.1007/s11042-022-13240-0